Zhizheng Wu

EigeNet: Geometry-Informed Multi-Modal Learning for Few-shot Novel View RIR Prediction

May 27, 2026
Via arXiv

VoxSafeBench: Not Just What Is Said, but Who, How, and Where

Apr 16, 2026
Via arXiv

MimicLM: Zero-Shot Voice Imitation through Autoregressive Modeling of Pseudo-Parallel Speech Corpora

Apr 13, 2026
Via arXiv

Grounding Sim-to-Real Generalization in Dexterous Manipulation: An Empirical Study with Vision-Language-Action Models

Mar 24, 2026
Via arXiv

NV-Bench: Benchmark of Nonverbal Vocalization Synthesis for Expressive Text-to-Speech Generation

Mar 16, 2026
Via arXiv

WhispEar: A Bi-directional Framework for Scaling Whispered Speech Conversion via Pseudo-Parallel Whisper Generation

Mar 09, 2026
Via arXiv

Anatomy of the Modality Gap: Dissecting the Internal States of End-to-End Speech LLMs

Mar 02, 2026
Via arXiv

VoxPrivacy: A Benchmark for Evaluating Interactional Privacy of Speech Language Models

Jan 27, 2026
Via arXiv

Linear Script Representations in Speech Foundation Models Enable Zero-Shot Transliteration

Jan 06, 2026
Via arXiv

Aliasing-Free Neural Audio Synthesis

Dec 23, 2025
Via arXiv